Abstract

This  paper  summarizes  the  results  of  a  DoD  Systems  Engineering  Research  Center  (SERC)  project  to 
synthesize  analyses  of  DoD  SE  effectiveness  risk  sources  into  a  lean  framework  and  toolset  for  early 
identification  of  SE-related  program  risks.  It  includes  concepts  of  operation  which  enable  project 
sponsors  and  performers  to  agree  on  the  nature  and  use  of  more  effective  evidence-based  reviews. 
These  enable  early  detection  of  missing  SE  capabilities  or  personnel  competencies  with  respect  to  a 
framework  of  Goals,  Critical  Success  Factors  (CSFs),  and  Questions  determined  from  leading  DoD 
early-SE CSF analyses. The SE Effectiveness Measurement (EM) tools enable risk-based prioritization
of corrective actions: shortfalls in evidence for each question are early uncertainties which, when
combined with the relative system impact of a negative answer to the question, translate into the degree
of risk that needs to be managed to avoid system overruns and incomplete deliveries.


Introduction: Motivation and Context

DoD  programs  need  effective  systems  engineering  (SE)  to  succeed. 

DoD  program  managers  need  early  warning  of  any  risks  to  achieving  effective  SE. 

This SERC project has synthesized analyses of DoD SE effectiveness risk sources into a lean framework
and toolset for early identification of SE-related program risks.

Three  important  points  need  to  be  made  about  these  risks. 

•  They  are  generally  not  indicators  of  "bad  SE."  Although  SE  can  be  done  badly,  more  often  the  risks 
are  consequences  of  inadequate  program  funding  (SE  is  the  first  victim  of  an  underbudgeted 
program),  of  misguided  contract  provisions  (when  a  program  manager  is  faced  with  the  choice 
between  allocating  limited  SE  resources  toward  producing  contract-incentivized  functional 
specifications  vs.  addressing  key  performance  parameter  risks,  the  path  of  least  resistance  is  to  obey 
the  contract),  or  of  management  temptations  to  show  early  progress  on  the  easy  parts  while 
deferring  the  hard  parts  till  later. 

•  Analyses have shown that unaddressed risk generally leads to serious budget and schedule overruns.

•  Risks  are  not  necessarily  bad.  If  an  early  capability  is  needed,  and  the  risky  solution  has  been 
shown  to  be  superior  to  the  alternatives,  accepting  and  focusing  on  mitigating  the  risk  is  generally 
better  than  waiting  for  a  better  alternative  to  show  up. 

Unlike traditional schedule-based and event-based reviews, the SERC SE EM technology enables
sponsors  and  performers  to  agree  on  the  nature  and  use  of  more  effective  evidence-based  reviews. 
These  enable  early  detection  of  missing  SE  capabilities  or  personnel  competencies  with  respect  to  a 
framework  of  Goals,  Critical  Success  Factors  (CSFs),  and  Questions  determined  by  the  EM  task  from 
the leading DoD early-SE CSF analyses. The EM tools enable risk-based prioritization of corrective
actions: shortfalls in evidence for each question are early uncertainties which, when combined with
the relative system impact of a negative answer to the question, translate into the degree of risk that
needs to be managed to avoid system overruns and incomplete deliveries.

The  EM  tools’  definition  of  “SE  effectiveness”  is  taken  from  the  INCOSE  definition  of  SE  as  “an 
interdisciplinary  approach  and  means  to  enable  the  realization  of  successful  systems.”  Based  on  this 
definition,  the  SERC  project  proceeded  to  identify  and  organize  a  framework  of  SE  effectiveness 
measures (EMs) that could be used to assess the evidence that an MDAP’s SE approach, current results,
and  personnel  competencies  were  sufficiently  strong  to  enable  program  success.  Another  component  of 
the  research  was  to  formulate  operational  concepts  that  would  enable  MDAP  sponsors  and  performers  to 
use  the  EMs  as  the  basis  of  collaborative  formulation,  scoping,  planning,  and  monitoring  of  the 
program’s  SE  activities,  and  to  use  the  monitoring  results  to  steer  the  program  toward  the  achievement 
of  feasible  SE  solutions. 


Technical  Approach 


The EM research project reviewed over two dozen sources of candidate SE EMs and converged on the
strongest sources for identifying candidates. We developed a coverage matrix to
determine the envelope of candidate EMs and the strength of consensus on each candidate EM. We fed
the results back to the source originators to validate the coverage matrix. This resulted in further
insights and added candidate EMs to be incorporated into an SE Performance Risk Framework. The
resulting framework is organized into a hierarchy with 4 Goals, 18 Critical Success Factors, and 74
Questions that appeared to cover the central core of common determinants of SE
effectiveness.

Concurrently,  the  research  project  was  extended  to  also  assess  SE  personnel  competency  as  a 
determinant  of  program  success.  We  analyzed  an  additional  six  personnel  competency  risk  frameworks 
and  sets  of  questions.  Their  Goals  and  Critical  Success  Factors  were  very  similar  to  those  used  in  the  SE 
Performance  Risk  Framework,  although  the  Questions  were  different.  The  resulting  SE  Competency 
Risk  Framework  added  one  further  Goal  of  Professional  and  Interpersonal  Skills  with  five  Critical 
Success  Factors,  resulting  in  a  framework  of  5  Goals,  23  Critical  Success  Factors,  and  81  Questions. 

Our initial research focused on identifying methods that might be suitable for assessing the effectiveness
of  systems  engineering  on  major  defense  acquisition  programs  (MDAPs).  A  literature  review  identified 
eight  candidate  measurement  methods:  the  NRC  Pre-Milestone  A  &  Early-Phase  SysE  top-20  checklist 
[20];  the  Air  Force  Probability  of  Program  Success  (PoPS)  Framework  [1];  the  INCOSE/LMCO/MIT 
Leading  Indicators  [24];  the  Stevens  Leading  Indicators  (new;  using  SADB  root  causes)  [34];  the  USC 
Anchor  Point  Feasibility  Evidence  criteria  [31];  the  UAH  teaming  theories  criteria  [14];  the  NDIA/SEI 
capability/challenge criteria [15]; and the SISAIG Early Warning Indicators [9] incorporated into the
USC Macro Risk Tool [33].

Pages 5-8 of the NRC report [20] suggest a “Pre-Milestone A/B Checklist” for judging the successful
completion  of  early-phase  systems  engineering.  Using  this  checklist  as  a  concise  starting  point,  we 
identified  similar  key  elements  in  each  of  the  other  candidate  measurement  methods,  resulting  in  a 
coverage  matrix  with  a  list  of  45  characteristics  of  effective  systems  engineering.  Figure  1  shows  the 
first  page  of  the  coverage  matrix.  We  then  had  the  originators  of  the  measurement  methods  indicate 
where  they  felt  the  coverage  matrix  was  inaccurate  or  incomplete.  This  assessment  also  identified 
another  six  EM  characteristics  not  previously  noted. 

Figure  1.  EM  Coverage  Matrix 
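
The coverage-matrix bookkeeping described above can be sketched in Python as follows. The source names and characteristics are hypothetical placeholders, not entries from the actual matrix in Figure 1; the sketch only illustrates how the envelope of candidate EMs (the union of all characteristics) and the strength of consensus on each one (the number of sources covering it) might be derived.

```python
from collections import Counter

# Hypothetical coverage data: source -> characteristics it addresses.
# All names below are illustrative placeholders, not data from the study.
coverage = {
    "source_A": {"KPPs identified", "CONOPS developed", "risks assessed"},
    "source_B": {"KPPs identified", "risks assessed"},
    "source_C": {"KPPs identified", "staffing adequate"},
}


def envelope(cov):
    """Return the union of characteristics mentioned by any source."""
    return set().union(*cov.values())


def consensus(cov):
    """Count how many sources cover each characteristic."""
    return Counter(c for chars in cov.values() for c in chars)


if __name__ == "__main__":
    counts = consensus(coverage)
    # List characteristics from strongest to weakest consensus.
    for name in sorted(envelope(coverage), key=lambda c: (-counts[c], c)):
        print(f"{name}: covered by {counts[name]} of {len(coverage)} sources")
```

A characteristic covered by only one source would still be in the envelope, but with weak consensus; this is the kind of entry that feedback from the source originators could confirm or correct.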

Previous research by the USC team into a macro-risk model for large-scale projects had resulted in a
taxonomy of high-level goals and supporting critical success factors (CSFs) based on [28]. This was
identified as a potential framework for organizing the 51 EM characteristics identified above. Analysis
of the characteristics showed that they could be similarly organized into a series of four high-level goals,
each containing 4-5 CSFs, as seen in Figure 2. Our survey of the existing literature suggests that these
CSFs  are  among  the  factors  that  are  most  critical  to  successful  SE,  and  that  the  degree  to  which  the  SE 
function  in  a  program  satisfies  these  CSFs  is  a  measure  of  SE  effectiveness. 

Figure  2.  Goals  and  CSFs  for  SE  Performance 

Related to the effectiveness measures of SE performance is the need to measure the effectiveness of the
staff assigned to the SE function. Besides the eight SEPRT sources, six additional sources were
reviewed for contributions to Personnel Competency evidence questions: the Office of the Director of
National Intelligence (ODNI), Subdirectory Data Collection Tool: Systems Engineering [22]; the
INCOSE  Systems  Engineering  Handbook,  August  2007  [17];  the  ASN  (RD&A),  Guidebook  for 
Acquisition  of  Naval  Software  Intensive  Systems,  September  2008  [3];  the  CMU/SEI,  Models  for 
Evaluating  and  Improving  Architecture  Competence  report  [4];  the  NASA  Office  of  the  Chief  Engineer, 
NASA  Systems  Engineering  Behavior  Study,  October  2008  [34];  and  the  National  Research  Council, 
Human-System  Integration  in  the  System  Development  Process  report,  2007  [23]. 

These  were  analyzed  for  candidate  knowledge,  skills,  and  abilities  (KSA)  attributes  proposed  for 
systems  engineers.  Organizing  these  work  activities  and  KSAs  revealed  that  the  first  four  goals  and 
their CSFs were in common with the EM taxonomy. Figure 3 shows the
compatibility of the four goals in the EM taxonomy with the first four goals in the National Defense
Industrial Association’s (NDIA) SE Personnel Competency framework and those in the CMU/SEI Models for
Evaluating and Improving Architecture Competence report.

Figure  3.  Comparison  of  EM  Competency  Framework  with  NDIA  and  SEI  Counterparts 


SERC EM Framework                                          NDIA Personnel Competency FW   SEI Architect Competency FW

Concurrent Definition of System Requirements & Solutions   Systems Thinking               Stakeholder Interaction
System Life Cycle Organization, Planning, Staffing         Life Cycle View                Other phases
Technology Maturing and Architecting                       SE Technical                   Architecting
Evidence-Based Progress Monitoring & Commitment Reviews    SE Technical Management        Management
Professional/Interpersonal (added)                         Professional/Interpersonal     Leadership, Communication, Interpersonal

As  one  might  expect,  the  two  competency  frameworks  also  had  a  fifth  goal  emphasizing  professional 
and  interpersonal  competencies.  Drawing  on  these  and  the  other  Personnel  Competency  sources  cited 
above,  an  additional  goal  and  its  related  CSFs  were  added  for  the  EM  Competency  framework,  as 
presented  in  Figure  4. 


Figure  4.  Additional  goals  and  CSFs  for  SE  competency 

High-level Goal: Professional and interpersonal skills

Critical Success Factors:

•  Ability to plan, staff, organize, team-build, control, and direct systems engineering teams

•  Ability to work with others to negotiate, plan, execute, and coordinate complementary tasks for achieving
program objectives

•  Ability to perform timely, coherent, and concise verbal and written communication

•  Ability to deliver on promises and behave ethically

•  Ability to cope with uncertainty and unexpected developments, and to seek help and fill relevant
knowledge gaps

Question-Level  Impact/Evidence  Ratings  and  Project  SE  Risk  Assessment 

Using  these  relatively  high-level  criteria,  however,  it  is  difficult  to  evaluate  whether  the  SE  on  a 
particular  program  adequately  satisfies  the  CSFs.  In  its  approach  to  evaluating  macro-risk  in  a  program, 
[31]  suggests  that  a  goal-question-metric  (GQM)  approach  [4]  provides  a  method  to  accomplish  this 
evaluation.  Following  this  example,  we  developed  questions  to  explore  each  goal  and  CSF,  and  devised 
metrics  to  determine  the  relevance  of  each  question  and  the  quality  of  each  answer. 

The  researchers  began  question  development  for  the  SE  performance  framework  with  the  checklist  from 
[20].  Further  questions  were  adapted  from  the  remaining  EM  characteristics,  rewritten  as  necessary  to 
express  them  in  the  form  of  a  question.  Each  question  is  phrased  such  that,  answered  affirmatively,  it 
indicates  positive  support  of  the  corresponding  CSF.  Thus,  the  strength  of  support  for  each  answer  is 
related  to  the  relative  risk  probability  associated  with  the  CSF  that  question  explores. 

Rather  than  rely  simply  on  the  opinion  of  the  evaluator  as  to  the  relative  certainty  of  positive  SE 
performance,  a  stronger  and  more  quantifiable  evidence-based  approach  was  selected.  The  strength  of 
the  response  is  related  to  the  amount  of  evidence  available  to  support  an  affirmative  answer — the 
stronger  the  evidence,  the  lower  the  risk  probability.  Feedback  from  industry,  government,  and  academic 
participants  in  workshops  conducted  in  March  and  May  2009  suggested  that  a  simple  risk  probability 
scale  with  four  discrete  values  be  employed  for  this  purpose. 

Evidence  takes  whatever  form  is  appropriate  for  the  particular  question.  For  example,  a  simulation 
model  might  provide  evidence  that  a  particular  performance  goal  can  be  met.  Further,  the  strongest 
evidence  is  that  which  independent  expert  evaluators  have  validated. 

Recognizing  that  each  characteristic  might  be  more  or  less  applicable  to  a  particular  program  being 
evaluated,  the  questions  are  also  weighted  according  to  the  risk  impact  that  failure  to  address  the 
question might be expected to have on the program. Again based on workshop feedback, a four-value
scale  for  impact  was  chosen. 

The product of the magnitude of a potential loss (the risk impact) and the likelihood of that loss (the risk
probability) is the risk exposure. Although risk exposure is generally calculated from quantitative
real-number estimates of the magnitude and probability of a loss, the assessments of risk impact and risk
probability described above use ordinal scales. However, as shown in the tool below, we have
associated quantitative ranges of loss magnitude and loss probability with the rating levels, providing a
quantitative basis for mapping the four-value risk probability and risk impact scales to a
discrete five-value risk exposure scale.


Prototype  SE  Effectiveness  Risk  Tools 

As  a  means  to  test  the  utility  of  these  characteristics  for  assessing  systems  engineering  effectiveness, 
using  the  GQM  approach  outlined  above,  the  researchers  created  prototype  tools  that  might  be  used  to 
perform  periodic  evaluations  of  a  project,  similar  to  a  tool  used  in  conjunction  with  the  macro-risk 
model  described  above.  The  following  section  describes  this  prototype  implementation  in  further  detail. 


SE  Performance  Risk  Tool 

The  Systems  Engineering  Performance  Risk  Tool  (SEPRT)  is  an  Excel  spreadsheet-based  prototype 
focused  on  enabling  projects  to  determine  their  relative  risk  exposure  due  to  shortfalls  in  their  SE 
performance  relative  to  their  prioritized  project  needs.  It  complements  other  SE  performance 
effectiveness  assessment  capabilities  such  as  the  INCOSE  Leading  Indicators,  in  that  it  supports  periodic 
assessment  of  evidence  of  key  SE  function  performance,  as  compared  to  supporting  continuous 
assessment  of  key  project  SE  quantities  such  as  requirements  volatility,  change  and  problem  closure 
times,  risk  handling,  and  staffing  trends. 

The  operational  concept  of  the  SEPRT  tool  is  to  enable  project  management  (generally  the  Project 
Manager  or  his/her  designate)  to  prioritize  the  relative  impact  on  the  particular  project  of  shortfalls  in 
performing  the  SE  task  represented  in  each  question.  Correspondingly,  the  tool  enables  the  project 
systems  engineering  function  (generally  the  Chief  Engineer  or  Chief  Systems  Engineer  or  their 
designate)  to  evaluate  the  evidence  that  the  project  has  adequately  performed  that  task.  This 
combination of impact and risk assessment enables the tool to estimate the relative project risk exposure
for each question, and to display it in a color-coded Red-Yellow-Green form.

These  ideas  were  reviewed  in  workshops  with  industry,  government,  and  academic  participants 
conducted  in  March  and  May  2009,  with  respect  to  usability  factors  in  a  real  project  environment.  A 
consensus  emerged  that  the  scale  of  risk  impact  and  risk  probability  estimates  should  be  kept  simple  and 
easy to understand. Thus a red, yellow, green, and gray scale was suggested to code the risk impact; and
a  corresponding  red,  yellow,  green,  and  blue  scale  to  code  the  risk  probability.  These  scales  are 
discussed  in  more  depth  below.  An  example  of  the  rating  scales,  questions,  and  calculated  risk  exposure 
in  the  prototype  tool  is  presented  in  Figure  5  below. 

NOTE: Impact and evidence/risk ratings should be done independently. The
impact rating should estimate the effect a failure to address the specified item
might have on the program. The evidence rating should specify the quality of
evidence that has been provided, which demonstrates that the specified risk
item has been satisfactorily addressed.

Goal 1: Concurrent definition of system requirements and solutions

Critical Success Factor 1.1: Understanding of stakeholder needs: capabilities, operational concept, key
performance parameters, enterprise fit (legacy)

(a) At Milestone A, have the KPPs been identified in clear, comprehensive, concise terms that
are understandable to all stakeholders?

(b) Has a CONOPS been developed showing that the system can be operated to handle both
nominal and off-nominal workloads, to meet response time requirements, and generally
to meet the defined KPPs?

(c) Has the ability of the system to meet mission effectiveness goals been verified through
the use of modeling and simulation?

(d) Have the success-critical stakeholders been identified, their roles and responsibilities
negotiated, and their needs clearly represented by the KPPs and CONOPS?

(e) Have issues about the fit of the system into the stakeholders' context -- acquirers, end
users, administrators, interoperators, maintainers, etc. -- been adequately explored?


Figure  5.  The  SEPRT  Tool  Seeks  Performance  Evidence 


Risk impact ratings for a shortfall in performing the SE task in question vary from critical impact
(red: 40-100%; average 70% cost-schedule-capability shortfall) through significant impact
(yellow: 20-40%; average 30% shortfall) and moderate impact (green: 2-20%; average 11% shortfall)
to little-no impact (gray: 0-2%; average 1% shortfall). These relative impact ratings enable projects to
tailor the evaluation to the project’s specific situation. Thus, for example, it is easy to “drop” a question
by clicking on its “No Impact” button, but also easy to restore it by clicking on a higher impact button.
The rating scale for the impact level is based on the user’s chosen combination of effects on the
project’s likely cost overrun, schedule overrun, and shortfall of delivered capability relative to promised
capability (considering that there are various tradeoffs among these quantities).

Using Question 1.1(a) from Figure 5 as an example, if the project were a back-room application for base
operations  with  no  mission-critical  key  performance  parameters  (KPPs),  its  impact  rating  would  be 
Little-No  impact  (Gray).  However,  if  the  project  were  a  C4ISR  system  with  several  mission-critical 
KPPs,  its  rating  would  be  Critical  impact  (Red). 

The  Evidence/Risk  rating  is  the  project’s  degree  of  evidence  that  each  SE  effectiveness  question  is 
satisfactorily  addressed,  scored  (generally  by  the  project  Chief  Engineer  or  Chief  Systems  Engineer  or 
their  designate)  on  a  risk  probability  scale:  the  less  evidence,  the  higher  the  probability  of  shortfalls.  As 
with the Impact scale, the Evidence scale has associated quantitative ratings: Little or No Evidence: P =
0.4-1.0, average 0.7; Weak Evidence: P = 0.2-0.4, average 0.3; Partial Evidence: P = 0.02-0.2,
average 0.11; Strong Evidence: P = 0-0.02, average 0.01.

Again using Question 1.1(a) from Figure 5 as an example, for a C4ISR system with several
mission-critical KPPs, a lack of evidence (from analysis of current-system shortfalls and/or the use
of operational scenarios and prototypes) that its “KPPs had been identified at Milestone A in clear,
comprehensive, concise terms that are understandable to the users of the system” would result in a High
risk probability, while strong and externally validated evidence would result in a Very Low risk
probability.

Using the average probability and impact values presented above, the average-valued Risk Exposure =
P(Risk) * Size(Risk), relative to 100%, implied by the ratings is presented in Figure 6. The SEPRT tool
provides a customizable mapping of each impact/probability pair to a color-coded risk exposure, based
on this table. For each question, the risk exposure level is determined by the combination of risk
impact and risk probability, and a corresponding risk exposure color-coding is selected, which ranges
from red for the highest risk exposure to green for the lowest. Figure 6 shows the default color-coding used in
the SEPRT tool; an additional Excel sheet in the tool enables users to specify different color codings.


Figure  6.  Average  risk  exposure  calculation  and  default  color  code 


Impact // Probability   Very Low   Low    Medium   High

Critical                0.7        7.7    21       49
Significant             0.3        3.3    9        21
Moderate                0.11       1.21   3.3      7.7
Little-No Impact        0.01       0.11   0.3      0.7
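
The average-valued risk exposure calculation can be reproduced with a short Python sketch. It uses only the average impact and probability values quoted in the text. The pairing of the Low and Medium probability levels with Partial and Weak Evidence is inferred from the ordering of the scales and should be read as an assumption, as should the function and variable names.

```python
# Average impact per rating level, in percent of cost-schedule-capability shortfall.
IMPACT_AVG_PCT = {
    "Critical": 70,
    "Significant": 30,
    "Moderate": 11,
    "Little-No Impact": 1,
}

# Average probability of shortfall per rating level.
# Evidence-level pairings for "Low" and "Medium" are assumed from scale order.
PROBABILITY_AVG = {
    "Very Low": 0.01,  # Strong Evidence
    "Low": 0.11,       # Partial Evidence (assumed pairing)
    "Medium": 0.30,    # Weak Evidence (assumed pairing)
    "High": 0.70,      # Little or No Evidence
}


def risk_exposure(impact, probability):
    """Average risk exposure in percent: P(Risk) * Size(Risk)."""
    return round(PROBABILITY_AVG[probability] * IMPACT_AVG_PCT[impact], 2)


if __name__ == "__main__":
    print("Impact // Probability", *PROBABILITY_AVG, sep=" | ")
    for impact in IMPACT_AVG_PCT:
        row = [risk_exposure(impact, p) for p in PROBABILITY_AVG]
        print(impact, *row, sep=" | ")
```

For the C4ISR example above, a Critical-impact question with little or no evidence gives an average exposure of 0.7 * 70% = 49%, while the same question with strong evidence gives 0.01 * 70% = 0.7%.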

As seen in Figure 5, the risk exposure resulting from scoring the impact and risk of each question is presented
in the leftmost column. Based on suggestions from workshop participants, the current version of the tool
assigns the highest risk exposure level achieved by any of the questions in a CSF as the risk exposure for
the overall CSF. This maximum risk exposure is presented in the rightmost column for the CSF. This
rating  method  has  the  advantages  of  being  simple  and  conservative,  but  might  raise  questions  if,  for 
example,  CSF  1.1  were  given  a  red  risk  exposure  level  for  one  red  and  four  greens,  and  a  yellow  risk 
exposure  level  for  five  yellows.  Experience  from  piloting  of  the  tool  has  suggested  refinements  to  this 
approach,  discussed  later  in  this  report. 
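
The maximum-based roll-up can be illustrated with a minimal Python sketch. A three-color scale is assumed here for simplicity, and the function name is hypothetical; the actual tool is an Excel spreadsheet with a longer, customizable exposure scale.

```python
# Exposure levels ordered from lowest to highest (assumed three-color scale).
LEVELS = ["green", "yellow", "red"]


def csf_exposure(question_levels):
    """Roll question-level exposures up to the CSF by taking the highest level."""
    return max(question_levels, key=LEVELS.index)


if __name__ == "__main__":
    # One red and four greens: the single red dominates.
    print(csf_exposure(["red", "green", "green", "green", "green"]))
    # Five yellows: the CSF is rated yellow.
    print(csf_exposure(["yellow"] * 5))
```

The two calls reproduce the case discussed above: a CSF with one red and four greens is rated red, while one with five yellows is rated yellow, even though the second may carry more total exposure.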


SE  Competency  Risk  Tool 

The  initial  section  of  the  Systems  Engineering  Competency  Risk  Tool  (SECRT)  is  shown  in  Figure  7.  It 
functions  in  the  same  way  as  the  SEPRT  tool  described  above,  but  its  questions  address  key 
considerations  of  personnel  competency  for  each  CSF.  The  space  limitations  of  this  paper  preclude 
showing all of the SEPRT and SECRT questions corresponding to the goals and CSFs. They are
provided in the downloadable tools and SERC EM project Final Technical Report [36] at the SERC web
site at TBD.

NOTE: Impact and evidence/risk ratings should be done independently. The
impact rating should estimate the effect a failure to competently address the
specified item might have on the program. The competency rating should
specify the observed, historical experience and competency of the systems
engineering staff on past programs with respect to the specified risk item.

Goal 1: Concurrent definition of system requirements and solutions

Critical Success Factor 1.1: Understanding of stakeholder needs: capabilities, operational concept, key
performance parameters, enterprise fit (legacy). Evidence of ability to analyze
strengths and shortfalls in current-system operations via:

•  Participatory workshops, surveys, focus groups?

•  Operations research techniques: operations data collection and analysis?

•  Mission effectiveness modeling and simulation?

•  Prototypes, scenarios, stories, personas?

•  Ethnographic techniques: interviews, sampled observations, cognitive task analysis?


SEPRT  and  SECRT  Concepts  of  Operation 

The  SEPRT  and  SECRT  framework  and  tools  provide  a  way  for  projects  to  identify  the  major  sources  of 
program  risk  due  to  SE  shortfalls.  This  section  summarizes  concepts  of  operation  for  applying  the  tools 
at major milestones, and at other points where SE demonstration shortfalls or other SE EMs such as the
INCOSE Leading Indicators have identified likely problem situations whose sources and relative
degrees of risk need further understanding. More detail and examples are provided in the SE
EM  technical  report  [36]. 

The  first  step  in  the  concept  of  operations  involves  collaborative  planning  by  a  project’s  sponsoring 
decision  authority  (at  a  developer  level,  a  program  level,  or  a  program  oversight  level)  and  its 
performing  organization(s)  to  reach  agreements  on  the  relative  priorities  of  its  needed  performance 
aspects  and  personnel  competencies,  as  measured  by  the  relative  program  impact  of  their  SEPRT  and 
SECRT  question  content.  The  planning  necessarily  includes  consideration  of  the  consistency  of  these 
priorities  with  the  project’s  SE  budget,  schedule,  and  contract  provisions.  This  stage  frequently 
identifies  inconsistencies  between  sponsor  priorities  (e.g.,  early  key  performance  parameter  (KPP) 
tradeoff  analysis)  and  contract  provisions  (e.g.,  progress  payments  and  award  fees  initially  focused  on 
functional  specifications  and  not  KPP  satisfaction),  and  enables  their  timely  resolution. 

The next step involves evaluation by independent experts of the evidence provided by the
performers that they can achieve the desired levels of SE performance within the budgets,
schedules, and staffing defined in their SE plans. This should include the rationale for the choices of
evidence preparation methods, such as prototyping, modeling, simulation, benchmarking, exercising, testbed
preparation, scenario generation, instrumentation, and data analysis, and their associated resource
consumption. Subsequently, progress with respect to the plans needs to be monitored; it is best to
consider the planned evidence as a first-class deliverable, and to include its progress measurement in the
project’s earned value monitoring system.

Finally,  the  completed  evidence  needs  to  be  provided  by  the  performers  at  each  major  milestone  review, 
along  with  the  performers’  rating  of  the  strength  of  the  evidence  and  the  associated  risk  level  indicated 
by  the  SEPRT  and  SECRT  tools.  These  ratings  are  then  evaluated  by  independent  experts,  and  adjusted 
where  the  evidence  is  stronger  or  weaker  than  the  indicated  ratings.  The  revised  ratings  are  discussed 
and  iterated  by  the  sponsors  and  performers,  and  used  to  determine  revised  SEPRT  and  SECRT  risk 
levels. These will enable sponsors and performers to determine the necessary risk mitigation plans,
budgets,  and  schedules  for  ensuring  project  success.  Again,  more  detailed  scenarios  and  flowcharts  are 
provided  in  the  SE  EM  technical  report  [36]. 


Summary  of  Framework  and  Tool  Evaluations 

We  solicited  pilot  evaluations  of  the  EM  performance  and  competency  frameworks,  using  the  prototype 
SEPRT  and  SECRT  tools,  from  industry,  government  agencies,  and  academic  participants.  Because  the 
task  re-scoping  permitted  only  a  single  round  of  piloting,  these  initial  evaluations  were  conducted 
against  historical  projects  and  case  studies.  The  tools  were  successfully  piloted  against  five  DoD 
projects, one NASA project, and one commercial project. They were also analyzed by two industrially
experienced colleagues against detailed case studies of a number of DoD and commercial projects. The
application domains piloted included space, medical systems, logistics, and systems-of-systems. Results
of  the  pilot  evaluations  were  reported  through  a  web-based  survey  tool  and  detailed  follow-up 
interviews,  while  the  case  study  evaluations  were  reported  through  detailed  comments  from  the 
reviewers. 

Evaluations  were  generally  positive,  and  the  frameworks  were  found  to  be  useful  across  all  project 
phases except Production, and for all system types except “legacy development.” The consensus of
reviewers  was  that  the  frameworks  would  be  most  useful  in  the  System  Development  &  Demonstration 
(SDD)  phase,  and  generally  more  useful  in  early  phases  than  later.  It  was  noted,  however,  that  in 
systems  developed  using  evolutionary  strategies,  such  “early”  phases  recur  throughout  the  development 
cycle,  extending  the  usefulness  of  the  frameworks.  The  evaluations  were  reported  to  take  2-5  hours  to 
complete  for  persons  familiar  with  the  projects,  with  materials  that  were  readily  at  hand.  Also,  in 
reviewing  case  study  material,  some  evaluators  reported  that  the  EM  framework  was  not  specific  to  any 
particular  problem  domain  (a  choice  we  made  to  make  domain  tailoring  user-performable  via  Excel). 

Several evaluators reported that the frameworks generated too many high-risk findings, which might
make the results too overwhelming to act on. In response to this significant concern, the impact
scales were adjusted so that the adjectives correspond better to the quantitative impacts
(Critical-Significant-Moderate-Little or No vs. High-Medium-Low-No impact), and a longer risk exposure
scale was developed to allow more nuanced results.
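
The way an evidence shortfall and an impact rating combine into a risk exposure can be illustrated with a minimal Python sketch. Only the four impact adjectives come from the text above. The numeric weights, the evidence-level labels, and the question identifiers are hypothetical placeholders, not the values used in the SEPRT and SECRT tools.

```python
# Hypothetical weights: the impact adjectives come from the text, but the
# numbers and the evidence-level labels are illustrative assumptions only.
IMPACT_WEIGHT = {
    "Critical": 1.0,
    "Significant": 0.6,
    "Moderate": 0.3,
    "Little or No": 0.0,
}
EVIDENCE_SHORTFALL = {
    "Unavailable": 1.0,
    "Weak": 0.7,
    "Partial": 0.4,
    "Strong": 0.1,
}


def risk_exposure(impact, evidence):
    """Risk exposure = (uncertainty from missing evidence) x (impact if wrong)."""
    return EVIDENCE_SHORTFALL[evidence] * IMPACT_WEIGHT[impact]


def prioritize(answers):
    """Rank (question, impact, evidence) triples, highest risk exposure first."""
    scored = [(risk_exposure(impact, evidence), question)
              for question, impact, evidence in answers]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical question identifiers, for illustration only.
example_answers = [
    ("Q1", "Moderate", "Unavailable"),
    ("Q2", "Critical", "Weak"),
    ("Q3", "Critical", "Strong"),
]
for score, question in prioritize(example_answers):
    print(question, round(score, 2))
```

Under these assumed weights, a critical-impact question with strong evidence ranks below a moderate-impact question with no evidence at all. A finer-grained scale of the kind described above would give reviewers more room to separate such cases.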

In  addition,  the  University  of  Maryland  (UMD)  Fraunhofer  Center  (FC)  performed  preliminary 
evaluations  against  the  Systemic  Analysis  Database  (SADB),  compiled  by  OUSD  (AT&L),  and  a 
mapping between the SEPRT questions and the Defense Acquisition Program Support (DAPS)
methodology underlying the SADB results. This evaluation approach allowed analysis of the
effectiveness of the frameworks with respect to the historical successes and failures of the subject projects,
and  another  cross  check  of  the  SEPRT  coverage.  Overall,  the  coverage  mapping  indicated  that  the  two 
were  largely  consistent,  with  more  domain  coverage  in  the  DAPS  methodology.  A  similar  mapping  was 
performed  between  the  SECRT  and  the  Defense  Acquisition  University’s  SPRDE-SE/PSE  Competency 
Model,  with  similar  results. 

Further,  a  business  case  analysis  for  the  investment  in  SE  effectiveness  evidence  was  performed,  based 
on  data  from  161  software-intensive  systems  projects  used  to  calibrate  the  COCOMO  II  cost  estimation 
model  and  its  Architecture  and  Risk  Resolution  scale  factor.  It  concluded  that  the  greater  the  project’s 
size,  criticality,  and  stability  are,  the  greater  is  the  need  for  validated  architecture  feasibility  evidence 
(i.e.,  evidence-based  specifications  and  plans).  However,  for  very  small,  low-criticality  projects  with 
high  volatility,  the  evidence  generation  efforts  would  make  little  difference  and  would  need  to  be 
continuously redone, producing a negative return on investment. In such cases, agile methods such as
rapid prototyping, Scrum, and eXtreme Programming will be more effective. Overall, evidence-based
specifications  and  plans  will  not  guarantee  a  successful  project,  but  in  general  will  eliminate  many  of  the 
software  delivery  overruns  and  shortfalls  experienced  on  current  software  projects.  Again,  more  details 
are  provided  in  the  SE  EM  technical  report  [36]. 
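
The size dependence in this conclusion can be illustrated with the COCOMO II effort equation, in which the Architecture and Risk Resolution (RESL) rating contributes to the exponent on project size. The sketch below uses the published COCOMO II.2000 calibration constants. As simplifying assumptions, it holds the other four scale factors at Nominal and all effort multipliers at 1.0, and it ignores the cost of producing the evidence itself. It is therefore only an illustration of the trend, not a reproduction of the business case analysis.

```python
# COCOMO II.2000 calibration constants (effort in person-months, size in KSLOC).
A = 2.94
B = 0.91

# RESL (Architecture and Risk Resolution) scale factor values by rating.
RESL = {
    "Very Low": 7.07,
    "Low": 5.65,
    "Nominal": 4.24,
    "High": 2.83,
    "Very High": 1.41,
    "Extra High": 0.00,
}

# Assumption: PREC, FLEX, TEAM, and PMAT are all held at their Nominal values.
OTHER_SCALE_FACTORS_NOMINAL = 3.72 + 3.04 + 3.29 + 4.68


def effort_person_months(ksloc, resl_rating):
    """Nominal effort, with all effort multipliers assumed to be 1.0."""
    exponent = B + 0.01 * (OTHER_SCALE_FACTORS_NOMINAL + RESL[resl_rating])
    return A * ksloc ** exponent


def added_effort_fraction(ksloc, poor="Very Low", good="Very High"):
    """Fractional extra effort incurred with a poor RESL rating vs. a good one."""
    return effort_person_months(ksloc, poor) / effort_person_months(ksloc, good) - 1.0


for size in (10, 100, 1000, 10000):
    print(f"{size:>6} KSLOC: {added_effort_fraction(size):.0%} added effort")
```

Because RESL enters through the exponent, the added effort associated with weak architecture and risk resolution grows with project size, which is consistent with the stated conclusion that larger projects have a greater need for feasibility evidence.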


Conclusions 

DoD  programs  need  effective  systems  engineering  (SE)  to  succeed. 

DoD  program  managers  need  early  warning  of  any  risks  to  achieving  effective  SE. 

This SERC project has synthesized the best analyses of DoD SE effectiveness risk sources into a lean
framework and toolset for early identification of SE-related program risks.

Three  important  points  need  to  be  made  about  these  risks. 

•  They  are  generally  not  indicators  of  "bad  SE."  Although  SE  can  be  done  badly,  more  often  the  risks 
are  consequences  of  inadequate  program  funding  (SE  is  the  first  victim  of  an  underbudgeted 
program),  of  misguided  contract  provisions  (when  a  program  manager  is  faced  with  the  choice 
between  allocating  limited  SE  resources  toward  producing  contract-incentivized  functional 
specifications  vs.  addressing  key  performance  parameter  risks,  the  path  of  least  resistance  is  to  obey 
the  contract),  or  of  management  temptations  to  show  early  progress  on  the  easy  parts  while 
deferring  the  hard  parts  till  later. 

•  Analyses  have  shown  that  unaddressed  risk  generally  leads  to  serious  budget  and  schedule  overruns. 

•  Risks  are  not  necessarily  bad.  If  an  early  capability  is  needed,  and  the  risky  solution  has  been 
shown  to  be  superior  to  the  alternatives,  accepting  and  focusing  on  mitigating  the  risk  is  generally 
better  than  waiting  for  a  better  alternative  to  show  up. 

The results of the SEPRT and SECRT pilot assessments, the DAPS and SADB comparative analysis,
and the quantitative business case analysis for the use of the SE EM framework, tools, and operational
concepts are sufficiently positive to conclude that implementation of the approach is worth pursuing.
Presentations at recent workshops have generated considerable interest in refining, using, and extending
the capabilities and in co-funding the follow-on research. However, the framework and prototype tools
have to date been shown to be largely efficacious only for pilot projects done by familiar experts in a
relatively short time. It remains to be demonstrated how well the framework and tools will perform on
in-process MDAPs with multiple missions, performers, and independent expert assessors.

Some  implications  of  defining  feasibility  evidence  as  a  “first  class”  project  deliverable  are  that  it  needs 
to  be  planned  (with  resources),  and  made  part  of  the  project’s  earned  value  management  system.  Any 
shortfalls  in  evidence  are  sources  of  uncertainty  and  risk,  and  should  be  covered  by  risk  management 
plans.  The  main  contributions  of  the  SERC  SE  EM  project  have  been  to  provide  experience-based 
approaches  and  operational  concepts  for  the  use  of  evidence  criteria,  evidence-generation  procedures, 
and  SE  effectiveness  measures  for  monitoring  evidence  generation,  which  support  the  ability  to  perform 
evidence-based  SE  on  DoD  MDAPs.  And  finally,  evidence-based  specifications  and  plans  such  as  those 
provided  by  the  SERC  SE  EM  capabilities  and  the  Feasibility  Evidence  Description  can  and  should  be 
added  to  traditional  milestone  reviews. 

As  a  bottom  line,  the  SERC  SE  capabilities  have  strong  potential  for  transforming  the  largely 
unmeasured  DoD  SE  activity  content  on  current  MDAPs  and  other  projects  into  an  evidence-based 
measurement  and  management  approach  for  both  improving  the  outcomes  of  current  projects,  and  for 
developing  a  knowledge  base  that  can  serve  as  a  basis  for  continuing  DoD  SE  effectiveness 
improvement. 